MERS exploration server

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

How independent are the appearances of n-mers in different genomes?

Internal identifier: 003150 (Main/Exploration); previous: 003149; next: 003151

Authors: Yuriy Fofanov; Yi Luo; Charles Katili; Jim Wang; Yuri Belosludtsev [United States]; Thomas Powdrill [United States]; Chetan Belapurkar; Viacheslav Fofanov; Tong-Bin Li; Sergey Chumakov [Mexico]; B. Montgomery Pettitt

Source :

RBID : ISTEX:D53F8EB08615F0D9A8366A70938A7F525415F9B9

French descriptors

English descriptors

Abstract

Motivation: Analysis of the statistical properties of DNA sequences is important for evolutionary biology as well as for DNA probe and PCR technologies. These technologies, in turn, can be used for organism identification, with applications in the diagnosis of infectious diseases, environmental studies, etc.

Results: We present the results of a correlation analysis of the distributions of the presence/absence of short nucleotide subsequences of different lengths (‘n-mers’, n = 5–20) in more than 1500 microbial and virus genomes, together with five genomes of multicellular organisms (including human). We calculate whether a given n-mer is present or absent in a given genome (frequency of presence), rather than the more commonly calculated number of appearances of n-mers in one or more genomes (frequency of appearance). For organisms that are not close relatives of each other, the presence/absence of different 7–20-mers in their genomes is not correlated. For close biological relatives, some correlation of the presence of n-mers in this range appears, but it is not as strong as expected. The suppressed correlations among the n-mers present in different genomes lead to the possibility of using random sets of n-mers (with appropriately chosen n) to discriminate between genomes of different organisms, and possibly between individual genomes of the same species (including human), with a low probability of error.

Supplementary information: Supplementary data is available at http://www.bioinfo.uh.edu/publications/independence_genomes/.
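The distinction between frequency of presence and frequency of appearance can be illustrated with a small sketch. The abstract does not say which correlation statistic the authors used, so the Pearson (phi) coefficient below is an assumption, as are the function names `nmer_presence` and `presence_correlation`. Each of the 4^n possible n-mers is treated as one observation with two binary indicators: present in genome A, present in genome B.

```python
import math


def nmer_presence(genome: str, n: int) -> set:
    """Return the set of distinct n-mers found in `genome`.

    Only presence is recorded, not how many times each n-mer occurs.
    Windows containing characters other than A, C, G, T are skipped.
    """
    genome = genome.upper()
    valid = set("ACGT")
    return {
        genome[i:i + n]
        for i in range(len(genome) - n + 1)
        if set(genome[i:i + n]) <= valid
    }


def presence_correlation(genome_a: str, genome_b: str, n: int) -> float:
    """Phi coefficient between the presence indicators of two genomes.

    Assumption: this statistic is chosen for illustration only; it is not
    taken from the paper. A value near 0 means the presence of an n-mer in
    one genome tells us little about its presence in the other.
    """
    total = 4 ** n  # number of possible n-mers over {A, C, G, T}
    present_a = nmer_presence(genome_a, n)
    present_b = nmer_presence(genome_b, n)
    p_a = len(present_a) / total
    p_b = len(present_b) / total
    p_ab = len(present_a & present_b) / total  # present in both genomes
    denom = math.sqrt(p_a * (1 - p_a) * p_b * (1 - p_b))
    if denom == 0:
        return 0.0  # an indicator is constant, so correlation is undefined
    # Under independence p_ab equals p_a * p_b and the numerator vanishes.
    return (p_ab - p_a * p_b) / denom
```

For real genomes and n near 20, holding sets of strings in memory becomes expensive; a production version would more plausibly use a bit array or hashed encoding, which this sketch does not attempt.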

Url:
DOI: 10.1093/bioinformatics/bth266


Affiliations:


Links toward previous steps (curation, corpus...)


The document in XML format

<record>
<TEI wicri:istexFullTextTei="biblStruct">
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">How independent are the appearances of n-mers in different genomes?</title>
<author>
<name sortKey="Fofanov, Yuriy" sort="Fofanov, Yuriy" uniqKey="Fofanov Y" first="Yuriy" last="Fofanov">Yuriy Fofanov</name>
</author>
<author>
<name sortKey="Luo, Yi" sort="Luo, Yi" uniqKey="Luo Y" first="Yi" last="Luo">Yi Luo</name>
</author>
<author>
<name sortKey="Katili, Charles" sort="Katili, Charles" uniqKey="Katili C" first="Charles" last="Katili">Charles Katili</name>
</author>
<author>
<name sortKey="Wang, Jim" sort="Wang, Jim" uniqKey="Wang J" first="Jim" last="Wang">Jim Wang</name>
</author>
<author>
<name sortKey="Belosludtsev, Yuri" sort="Belosludtsev, Yuri" uniqKey="Belosludtsev Y" first="Yuri" last="Belosludtsev">Yuri Belosludtsev</name>
</author>
<author>
<name sortKey="Powdrill, Thomas" sort="Powdrill, Thomas" uniqKey="Powdrill T" first="Thomas" last="Powdrill">Thomas Powdrill</name>
</author>
<author>
<name sortKey="Belapurkar, Chetan" sort="Belapurkar, Chetan" uniqKey="Belapurkar C" first="Chetan" last="Belapurkar">Chetan Belapurkar</name>
</author>
<author>
<name sortKey="Fofanov, Viacheslav" sort="Fofanov, Viacheslav" uniqKey="Fofanov V" first="Viacheslav" last="Fofanov">Viacheslav Fofanov</name>
</author>
<author>
<name sortKey="Li, Tong Bin" sort="Li, Tong Bin" uniqKey="Li T" first="Tong-Bin" last="Li">Tong-Bin Li</name>
</author>
<author>
<name sortKey="Chumakov, Sergey" sort="Chumakov, Sergey" uniqKey="Chumakov S" first="Sergey" last="Chumakov">Sergey Chumakov</name>
</author>
<author>
<name sortKey="Pettitt, B Montgomery" sort="Pettitt, B Montgomery" uniqKey="Pettitt B" first="B. Montgomery" last="Pettitt">B. Montgomery Pettitt</name>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:D53F8EB08615F0D9A8366A70938A7F525415F9B9</idno>
<date when="2004" year="2004">2004</date>
<idno type="doi">10.1093/bioinformatics/bth266</idno>
<idno type="url">https://api.istex.fr/ark:/67375/HXZ-LCPTC5C8-M/fulltext.pdf</idno>
<idno type="wicri:Area/Istex/Corpus">000012</idno>
<idno type="wicri:explorRef" wicri:stream="Istex" wicri:step="Corpus" wicri:corpus="ISTEX">000012</idno>
<idno type="wicri:Area/Istex/Curation">000012</idno>
<idno type="wicri:Area/Istex/Checkpoint">000C08</idno>
<idno type="wicri:explorRef" wicri:stream="Istex" wicri:step="Checkpoint">000C08</idno>
<idno type="wicri:doubleKey">1367-4803:2004:Fofanov Y:how:independent:are</idno>
<idno type="wicri:source">PubMed</idno>
<idno type="RBID">pubmed:15087315</idno>
<idno type="wicri:Area/PubMed/Corpus">002394</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Corpus" wicri:corpus="PubMed">002394</idno>
<idno type="wicri:Area/PubMed/Curation">002394</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Curation">002394</idno>
<idno type="wicri:Area/PubMed/Checkpoint">002259</idno>
<idno type="wicri:explorRef" wicri:stream="Checkpoint" wicri:step="PubMed">002259</idno>
<idno type="wicri:Area/Ncbi/Merge">000275</idno>
<idno type="wicri:Area/Ncbi/Curation">000275</idno>
<idno type="wicri:Area/Ncbi/Checkpoint">000275</idno>
<idno type="wicri:doubleKey">1367-4803:2004:Fofanov Y:how:independent:are</idno>
<idno type="wicri:Area/Main/Merge">003182</idno>
<idno type="wicri:source">INIST</idno>
<idno type="RBID">Pascal:05-0062280</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000080</idno>
<idno type="wicri:Area/PascalFrancis/Curation">000014</idno>
<idno type="wicri:Area/PascalFrancis/Checkpoint">000081</idno>
<idno type="wicri:explorRef" wicri:stream="PascalFrancis" wicri:step="Checkpoint">000081</idno>
<idno type="wicri:doubleKey">1367-4803:2004:Fofanov Y:how:independent:are</idno>
<idno type="wicri:Area/Main/Merge">003217</idno>
<idno type="wicri:Area/Main/Curation">003150</idno>
<idno type="wicri:Area/Main/Exploration">003150</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title level="a" type="main">How independent are the appearances of
<hi rend="italic">n</hi>
-mers in different genomes?</title>
<author>
<name sortKey="Fofanov, Yuriy" sort="Fofanov, Yuriy" uniqKey="Fofanov Y" first="Yuriy" last="Fofanov">Yuriy Fofanov</name>
<affiliation>
<wicri:noCountry code="no comma">Department of Computer Science and</wicri:noCountry>
</affiliation>
</author>
<author>
<name sortKey="Luo, Yi" sort="Luo, Yi" uniqKey="Luo Y" first="Yi" last="Luo">Yi Luo</name>
<affiliation>
<wicri:noCountry code="no comma">Department of Computer Science and</wicri:noCountry>
</affiliation>
</author>
<author>
<name sortKey="Katili, Charles" sort="Katili, Charles" uniqKey="Katili C" first="Charles" last="Katili">Charles Katili</name>
<affiliation>
<wicri:noCountry code="no comma">Department of Computer Science and</wicri:noCountry>
</affiliation>
</author>
<author>
<name sortKey="Wang, Jim" sort="Wang, Jim" uniqKey="Wang J" first="Jim" last="Wang">Jim Wang</name>
<affiliation>
<wicri:noCountry code="no comma">Department of Computer Science and</wicri:noCountry>
</affiliation>
</author>
<author>
<name sortKey="Belosludtsev, Yuri" sort="Belosludtsev, Yuri" uniqKey="Belosludtsev Y" first="Yuri" last="Belosludtsev">Yuri Belosludtsev</name>
<affiliation wicri:level="1">
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Vitruvius Biosciences, The Woodlands, TX</wicri:regionArea>
<wicri:noRegion>TX</wicri:noRegion>
</affiliation>
</author>
<author>
<name sortKey="Powdrill, Thomas" sort="Powdrill, Thomas" uniqKey="Powdrill T" first="Thomas" last="Powdrill">Thomas Powdrill</name>
<affiliation wicri:level="1">
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Vitruvius Biosciences, The Woodlands, TX</wicri:regionArea>
<wicri:noRegion>TX</wicri:noRegion>
</affiliation>
</author>
<author>
<name sortKey="Belapurkar, Chetan" sort="Belapurkar, Chetan" uniqKey="Belapurkar C" first="Chetan" last="Belapurkar">Chetan Belapurkar</name>
<affiliation>
<wicri:noCountry code="no comma">Department of Computer Science and</wicri:noCountry>
</affiliation>
</author>
<author>
<name sortKey="Fofanov, Viacheslav" sort="Fofanov, Viacheslav" uniqKey="Fofanov V" first="Viacheslav" last="Fofanov">Viacheslav Fofanov</name>
<affiliation>
<wicri:noCountry code="no comma">Department of Computer Science and</wicri:noCountry>
</affiliation>
</author>
<author>
<name sortKey="Li, Tong Bin" sort="Li, Tong Bin" uniqKey="Li T" first="Tong-Bin" last="Li">Tong-Bin Li</name>
<affiliation>
<wicri:noCountry code="no comma">Department of Computer Science and</wicri:noCountry>
</affiliation>
</author>
<author>
<name sortKey="Chumakov, Sergey" sort="Chumakov, Sergey" uniqKey="Chumakov S" first="Sergey" last="Chumakov">Sergey Chumakov</name>
<affiliation></affiliation>
<affiliation wicri:level="1">
<country xml:lang="fr">Mexique</country>
<wicri:regionArea>Department of Physics, University of Guadalajara, Guadalajara</wicri:regionArea>
<wicri:noRegion>Guadalajara</wicri:noRegion>
</affiliation>
</author>
<author>
<name sortKey="Pettitt, B Montgomery" sort="Pettitt, B Montgomery" uniqKey="Pettitt B" first="B. Montgomery" last="Pettitt">B. Montgomery Pettitt</name>
<affiliation></affiliation>
<affiliation>
<wicri:noCountry code="subField"></wicri:noCountry>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series>
<title level="j" type="main">Bioinformatics</title>
<title level="j" type="abbrev">Bioinformatics</title>
<idno type="ISSN">1367-4803</idno>
<idno type="eISSN">1460-2059</idno>
<imprint>
<publisher>Oxford University Press</publisher>
<date type="e-published">2004</date>
<date type="published">2004</date>
<biblScope unit="vol">20</biblScope>
<biblScope unit="issue">15</biblScope>
<biblScope unit="page" from="2421">2421</biblScope>
<biblScope unit="page" to="2428">2428</biblScope>
</imprint>
<idno type="ISSN">1367-4803</idno>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt>
<idno type="ISSN">1367-4803</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>Algorithms</term>
<term>Base Sequence</term>
<term>Chromosome Mapping (methods)</term>
<term>Conserved Sequence (genetics)</term>
<term>DNA Fingerprinting (methods)</term>
<term>Evolution, Molecular</term>
<term>Genome</term>
<term>Human</term>
<term>Identification</term>
<term>Models, Genetic</term>
<term>Models, Statistical</term>
<term>Molecular Sequence Data</term>
<term>Nucleotide sequence</term>
<term>Oligonucleotides (genetics)</term>
<term>Polymerase chain reaction</term>
<term>Sequence Alignment (methods)</term>
<term>Sequence Analysis, DNA (methods)</term>
<term>Statistical analysis</term>
<term>Statistics as Topic</term>
<term>Virus</term>
</keywords>
<keywords scheme="KwdFr" xml:lang="fr">
<term>Algorithmes</term>
<term>Alignement de séquences ()</term>
<term>Analyse de séquence d'ADN ()</term>
<term>Cartographie chromosomique ()</term>
<term>Données de séquences moléculaires</term>
<term>Modèles génétiques</term>
<term>Modèles statistiques</term>
<term>Oligonucléotides (génétique)</term>
<term>Profilage d'ADN ()</term>
<term>Statistiques comme sujet</term>
<term>Séquence conservée (génétique)</term>
<term>Séquence nucléotidique</term>
<term>Évolution moléculaire</term>
</keywords>
<keywords scheme="MESH" type="chemical" qualifier="genetics" xml:lang="en">
<term>Oligonucleotides</term>
</keywords>
<keywords scheme="MESH" qualifier="genetics" xml:lang="en">
<term>Conserved Sequence</term>
</keywords>
<keywords scheme="MESH" qualifier="génétique" xml:lang="fr">
<term>Oligonucléotides</term>
<term>Séquence conservée</term>
</keywords>
<keywords scheme="MESH" qualifier="methods" xml:lang="en">
<term>Chromosome Mapping</term>
<term>DNA Fingerprinting</term>
<term>Sequence Alignment</term>
<term>Sequence Analysis, DNA</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr">
<term>Analyse statistique</term>
<term>Génome</term>
<term>Homme</term>
<term>Identification</term>
<term>Réaction chaîne polymérase</term>
<term>Séquence nucléotide</term>
<term>Virus</term>
</keywords>
<keywords scheme="Teeft" xml:lang="en">
<term>Algorithms</term>
<term>Average ratio</term>
<term>Base Sequence</term>
<term>Different genomes</term>
<term>Different organisms</term>
<term>Evolution, Molecular</term>
<term>Genome</term>
<term>Genome lengths</term>
<term>Genome sizes</term>
<term>Genomic sequences</term>
<term>Independent events</term>
<term>Microarray</term>
<term>Microarray size</term>
<term>Microbial</term>
<term>Microbial genomes</term>
<term>Models, Genetic</term>
<term>Models, Statistical</term>
<term>Molecular Sequence Data</term>
<term>Multicellular</term>
<term>Multicellular organisms</term>
<term>Fingerprint</term>
<term>Organism</term>
<term>Random boundary</term>
<term>Short subsequences</term>
<term>Solid line</term>
<term>Statistics as Topic</term>
<term>Structural biology</term>
<term>Subsequence</term>
<term>Total number</term>
<term>Viral genomes</term>
</keywords>
<keywords scheme="MESH" xml:lang="fr">
<term>Algorithmes</term>
<term>Alignement de séquences</term>
<term>Analyse de séquence d'ADN</term>
<term>Cartographie chromosomique</term>
<term>Données de séquences moléculaires</term>
<term>Modèles génétiques</term>
<term>Modèles statistiques</term>
<term>Profilage d'ADN</term>
<term>Statistiques comme sujet</term>
<term>Séquence nucléotidique</term>
<term>Évolution moléculaire</term>
</keywords>
<keywords scheme="Wicri" type="topic" xml:lang="fr">
<term>Homme</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">Motivation: Analysis of the statistical properties of DNA sequences is important for evolutionary biology as well as for DNA probe and PCR technologies. These technologies, in turn, can be used for organism identification, with applications in the diagnosis of infectious diseases, environmental studies, etc. Results: We present the results of a correlation analysis of the distributions of the presence/absence of short nucleotide subsequences of different lengths (‘n-mers’, n = 5–20) in more than 1500 microbial and virus genomes, together with five genomes of multicellular organisms (including human). We calculate whether a given n-mer is present or absent in a given genome (frequency of presence), rather than the more commonly calculated number of appearances of n-mers in one or more genomes (frequency of appearance). For organisms that are not close relatives of each other, the presence/absence of different 7–20-mers in their genomes is not correlated. For close biological relatives, some correlation of the presence of n-mers in this range appears, but it is not as strong as expected. The suppressed correlations among the n-mers present in different genomes lead to the possibility of using random sets of n-mers (with appropriately chosen n) to discriminate between genomes of different organisms, and possibly between individual genomes of the same species (including human), with a low probability of error. Supplementary information: Supplementary data is available at http://www.bioinfo.uh.edu/publications/independence_genomes/.</div>
</front>
</TEI>
<affiliations>
<list>
<country>
<li>Mexique</li>
<li>États-Unis</li>
</country>
</list>
<tree>
<noCountry>
<name sortKey="Belapurkar, Chetan" sort="Belapurkar, Chetan" uniqKey="Belapurkar C" first="Chetan" last="Belapurkar">Chetan Belapurkar</name>
<name sortKey="Fofanov, Viacheslav" sort="Fofanov, Viacheslav" uniqKey="Fofanov V" first="Viacheslav" last="Fofanov">Viacheslav Fofanov</name>
<name sortKey="Fofanov, Yuriy" sort="Fofanov, Yuriy" uniqKey="Fofanov Y" first="Yuriy" last="Fofanov">Yuriy Fofanov</name>
<name sortKey="Katili, Charles" sort="Katili, Charles" uniqKey="Katili C" first="Charles" last="Katili">Charles Katili</name>
<name sortKey="Li, Tong Bin" sort="Li, Tong Bin" uniqKey="Li T" first="Tong-Bin" last="Li">Tong-Bin Li</name>
<name sortKey="Luo, Yi" sort="Luo, Yi" uniqKey="Luo Y" first="Yi" last="Luo">Yi Luo</name>
<name sortKey="Pettitt, B Montgomery" sort="Pettitt, B Montgomery" uniqKey="Pettitt B" first="B. Montgomery" last="Pettitt">B. Montgomery Pettitt</name>
<name sortKey="Wang, Jim" sort="Wang, Jim" uniqKey="Wang J" first="Jim" last="Wang">Jim Wang</name>
</noCountry>
<country name="États-Unis">
<noRegion>
<name sortKey="Belosludtsev, Yuri" sort="Belosludtsev, Yuri" uniqKey="Belosludtsev Y" first="Yuri" last="Belosludtsev">Yuri Belosludtsev</name>
</noRegion>
<name sortKey="Powdrill, Thomas" sort="Powdrill, Thomas" uniqKey="Powdrill T" first="Thomas" last="Powdrill">Thomas Powdrill</name>
</country>
<country name="Mexique">
<noRegion>
<name sortKey="Chumakov, Sergey" sort="Chumakov, Sergey" uniqKey="Chumakov S" first="Sergey" last="Chumakov">Sergey Chumakov</name>
</noRegion>
</country>
</tree>
</affiliations>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Sante/explor/MersV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 003150 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 003150 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Sante
   |area=    MersV1
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     ISTEX:D53F8EB08615F0D9A8366A70938A7F525415F9B9
   |texte=   How independent are the appearances of n-mers in different genomes?
}}

Wicri

This area was generated with Dilib version V0.6.33.
Data generation: Mon Apr 20 23:26:43 2020. Site generation: Sat Mar 27 09:06:09 2021